We consider the problem of continually releasing an estimate of the population mean of a stream of samples that is user-level differentially private (DP). At each time instant, a user contributes a sample, and the users can arrive in arbitrary order. Until now, these requirements of continual release and user-level privacy were considered in isolation. But, in practice, both requirements come together, as users often contribute data repeatedly and multiple queries are made. We provide an algorithm that outputs a mean estimate at every time instant $t$ such that the overall release is user-level $\varepsilon$-DP and has the following error guarantee: denoting by $M_t$ the maximum number of samples contributed by a user, as long as $\tilde{\Omega}(1/\varepsilon)$ users have $M_t/2$ samples each, the error at time $t$ is $\tilde{O}(1/\sqrt{t}+\sqrt{M_t}/t\varepsilon)$. This is a universal error guarantee, valid for all arrival patterns of the users. Furthermore, it (almost) matches the existing lower bounds for the single-release setting at all time instants when users have contributed an equal number of samples.
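For context, the single-release, item-level baseline that such guarantees are measured against is the textbook Laplace mechanism. The sketch below is that generic baseline only; the paper's user-level continual-release algorithm, which must account for repeated contributions per user, is not specified in the abstract and is not reproduced here.

```python
import math
import random


def dp_mean(samples, epsilon, lower=0.0, upper=1.0):
    """epsilon-DP estimate of the mean of bounded samples via the
    Laplace mechanism. Item-level, single release: a textbook
    baseline, NOT the user-level continual-release algorithm of
    the abstract."""
    n = len(samples)
    clipped = [min(max(x, lower), upper) for x in samples]
    true_mean = sum(clipped) / n
    # Replacing one sample changes the mean by at most this much.
    sensitivity = (upper - lower) / n
    scale = sensitivity / epsilon
    # Laplace(0, scale) noise via inverse-CDF sampling.
    u = random.random() - 0.5
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_mean + noise
```

Note how the noise scale grows as $1/\varepsilon$ and shrinks as $1/n$, the same trade-off that reappears (per user, per release) in the continual setting.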
Model-Based Reinforcement Learning (RL) is widely believed to have the potential to improve sample efficiency by allowing an agent to synthesize large amounts of imagined experience. Experience Replay (ER) can be considered a simple kind of model, which has proved extremely effective at improving the stability and efficiency of deep RL. In principle, a learned parametric model could improve on ER by generalizing from real experience to augment the dataset with additional plausible experience. However, owing to the many design choices involved in empirically successful algorithms, it can be very hard to establish where the benefits are actually coming from. Here, we provide theoretical and empirical insight into when, and how, we can expect data generated by a learned model to be useful. First, we provide a general theorem motivating how learning a model as an intermediate step can narrow down the set of possible value functions more than learning a value function directly from data using the Bellman equation. Second, we provide an illustrative example showing empirically how a similar effect occurs in a more concrete setting with neural network function approximation. Finally, we provide extensive experiments showing the benefit of model-based learning for online RL in environments with combinatorial complexity, but factored structure that allows a learned model to generalize. In these experiments, we take care to control for other factors in order to isolate, insofar as possible, the benefit of using experience generated by a learned model relative to ER alone.
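The core idea of augmenting real experience with experience generated by a learned model can be sketched with tabular Dyna-Q, where a deterministic model replays imagined transitions alongside ordinary Q-learning. This is a minimal illustration of the general mechanism, not the paper's experimental setup (which uses neural networks and factored environments).

```python
import random


def dyna_q(env_step, n_states, n_actions, episodes=50, planning_steps=10,
           alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Dyna-Q: Q-learning plus extra updates drawn from a
    learned deterministic model of observed transitions."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    model = {}  # (s, a) -> (r, s_next, done)
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = (random.randrange(n_actions) if random.random() < eps
                 else max(range(n_actions), key=lambda x: Q[s][x]))
            r, s2, done = env_step(s, a)
            # Real-experience update.
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) * (not done) - Q[s][a])
            model[(s, a)] = (r, s2, done)
            # Imagined-experience updates from the learned model.
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2, pdone) = random.choice(list(model.items()))
                Q[ps][pa] += alpha * (pr + gamma * max(Q[ps2]) * (not pdone)
                                      - Q[ps][pa])
            s = s2
    return Q
```

Replacing the learned `model` with a buffer of raw transitions recovers Experience Replay; the paper's question is when a *generalizing* model improves on that.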
Recent advances in the development of large language models have resulted in public access to state-of-the-art pre-trained language models (PLMs), including Generative Pre-trained Transformer 3 (GPT-3) and Bidirectional Encoder Representations from Transformers (BERT). However, evaluations of PLMs in practice have demonstrated their susceptibility to adversarial attacks during both the training and fine-tuning stages of development. Such attacks can result in erroneous outputs, model-generated hate speech, and the exposure of users' sensitive information. While existing research has focused on adversarial attacks during either the training or the fine-tuning of PLMs, there is a deficit of information on attacks made between these two development phases. In this work, we highlight a major security vulnerability in the public release of GPT-3 and further investigate this vulnerability in other state-of-the-art PLMs. We restrict our work to pre-trained models that have not been fine-tuned. Further, we underscore token-distance-minimized perturbations as an effective adversarial approach, bypassing both supervised and unsupervised quality measures. Following this approach, we observe a significant decrease in text classification quality when evaluating for semantic similarity.
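The abstract does not spell out how its token-distance-minimized perturbations are constructed, so the sketch below only illustrates the generic ingredient such attacks rely on: enumerating candidate tokens within minimal edit distance of the original, from which an attack would pick one that flips the model while evading quality filters.

```python
def edit1_variants(token, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """All strings within Levenshtein distance 1 of `token`
    (deletions, adjacent swaps, substitutions, insertions).
    A generic sketch of distance-minimized candidate generation,
    not the specific attack procedure of the paper."""
    splits = [(token[:i], token[i:]) for i in range(len(token) + 1)]
    deletes = {l + r[1:] for l, r in splits if r}
    swaps = {l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1}
    subs = {l + c + r[1:] for l, r in splits if r for c in alphabet}
    inserts = {l + c + r for l, r in splits for c in alphabet}
    return (deletes | swaps | subs | inserts) - {token}
```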
Mechanistic simulators are an indispensable tool for epidemiology, allowing the exploration of the behavior of complex, dynamic infections under varying conditions and the navigation of uncertain environments. ODE-based models are the dominant paradigm, enabling fast simulation and amenable to gradient-based optimization, but they make simplifying assumptions about population homogeneity. Agent-based models (ABMs) are an increasingly popular alternative paradigm that can represent the heterogeneity of contact interactions with granular detail and agency of individual behavior. However, conventional ABM frameworks are not differentiable and pose challenges in scalability; consequently, connecting them to auxiliary data sources is non-trivial. In this paper, we introduce GradABM: a scalable, fast, and differentiable design for ABMs. GradABM runs simulations in a few seconds on commodity hardware and enables fast forward and differentiable backward simulation. This makes it amenable to merging with deep neural networks and seamlessly integrating heterogeneous data sources to help with calibration, forecasting, and policy evaluation. We demonstrate the efficacy of GradABM via extensive experiments on real COVID-19 and influenza datasets. We are optimistic that this work will bring the ABM and AI communities closer together.
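The ODE-based paradigm the abstract contrasts with ABMs can be sketched with the classic SIR compartment model under forward-Euler integration. Because every step is a smooth arithmetic expression, wrapping such a simulator in an autodiff framework yields gradients of outputs with respect to `(beta, gamma)` for calibration; GradABM's contribution is extending this differentiability to agent-level simulation, which this sketch does not attempt.

```python
def simulate_sir(beta, gamma, s0=0.99, i0=0.01, r0=0.0, days=100, dt=0.1):
    """Forward-Euler integration of the SIR compartmental ODE on
    population fractions. beta: transmission rate, gamma: recovery
    rate. Returns the trajectory of (S, I, R) states."""
    s, i, r = s0, i0, r0
    trajectory = [(s, i, r)]
    for _ in range(int(days / dt)):
        new_inf = beta * s * i * dt   # S -> I flow this step
        new_rec = gamma * i * dt      # I -> R flow this step
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        trajectory.append((s, i, r))
    return trajectory
```

Each step conserves the total population fraction, a useful sanity check when calibrating against case data.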
Goal-conditioned Reinforcement Learning (RL) aims to learn optimal policies, given goals encoded in special command inputs. Here we study goal-conditioned neural nets (NNs) that learn to generate deep NN policies in the form of context-specific weight matrices, similar to Fast Weight Programmers of the 1990s and other methods. Using context commands of the form "generate a policy that achieves a desired expected return," our NN generators combine powerful exploration of parameter space with generalization across commands to iteratively find better and better policies. A form of weight-sharing HyperNetworks and policy embeddings scales our method to generating deep NNs. Experiments show how a single learned policy generator can produce policies that achieve any return seen during training. Finally, we evaluate our algorithm on a set of continuous control tasks where it exhibits competitive performance. Our code is public.
Learning to evaluate and improve policies is a core problem of Reinforcement Learning (RL). Traditional RL algorithms learn a value function defined for a single policy. A recently explored competitive alternative is to learn a single value function for many policies. Here we combine the actor-critic architecture of Parameter-Based Value Functions and the policy embedding of Policy Evaluation Networks to learn a single value function for evaluating (and thus helping to improve) any policy represented by a deep neural network (NN). The method yields competitive experimental results. In continuous control problems with infinitely many states, our value function minimizes its prediction error by simultaneously learning a small set of "probing states" and a mapping from actions produced in probing states to policy returns. The method extracts crucial abstract knowledge about the environment in the form of very few states sufficient to fully specify the behavior of many policies. A policy improves solely by changing actions in probing states, following the gradient of the value function's predictions. Surprisingly, it is possible to clone the behavior of near-optimal policies in the Swimmer-v3 and Hopper-v3 environments only by knowing how to act in 3 and 5 such states, respectively. Remarkably, our value function, trained to evaluate NN policies, is also invariant to changes of the policy architecture: we show that it allows for zero-shot learning of linear policies competitive with the best policies seen during training. Our code is public.
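The central idea, stripped of all learning, is that a policy is fingerprinted by the actions it takes at a handful of probing states, and return is predicted from that fingerprint alone. In the sketch below the probing states and the predictor weights are hand-fixed placeholders for quantities the paper learns jointly.

```python
def probe_policy(policy, probing_states):
    """Fingerprint a policy by the actions it emits at a few
    probing states (scalar actions for simplicity)."""
    return [policy(s) for s in probing_states]


def predicted_return(fingerprint, weights, bias=0.0):
    """Predict a policy's return from its fingerprint; a linear
    predictor stands in for the learned value function."""
    return bias + sum(w * a for w, a in zip(weights, fingerprint))
```

Policy improvement then amounts to adjusting the actions in the fingerprint along the gradient of `predicted_return`, without ever touching the rest of the state space.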
The field of surgical computer vision has undergone considerable breakthroughs in recent years with the rising popularity of deep neural network methods. However, standard fully supervised approaches for training require large amounts of annotated data, imposing a prohibitively high cost, especially in the clinical domain. Self-Supervised Learning (SSL) methods, which have begun to gain traction in the general computer vision community, represent a potential solution to these annotation costs, allowing useful representations to be learned from unlabeled data alone. Still, the effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored. In this work, we address this critical need by investigating four state-of-the-art SSL methods (MoCo v2, SimCLR, DINO, SwAV) in the context of surgical computer vision. We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding: phase recognition and tool presence detection. We examine their parameterization, then their behavior with respect to the amount of training data in semi-supervised settings. The correct transfer of these methods to surgery, as described and conducted in this work, leads to substantial performance gains over generic uses of SSL - up to 7% on phase recognition and 20% on tool presence detection - as well as over state-of-the-art semi-supervised phase recognition approaches, by up to 14%. The code will be made available at https://github.com/CAMMA-public/SelfSupSurg.
Diffusion models have recently been shown to generate high-quality synthetic images, especially when paired with a guidance technique to trade off diversity for fidelity. We explore diffusion models for the problem of text-conditional image synthesis and compare two different guidance strategies: CLIP guidance and classifier-free guidance. We find that the latter is preferred by human evaluators for both photorealism and caption similarity, and it often produces photorealistic samples. Samples from a 3.5 billion parameter text-conditional diffusion model using classifier-free guidance are favored by human evaluators over those from DALL-E, even when the latter uses expensive CLIP reranking. Additionally, we find that our models can be fine-tuned to perform image inpainting, enabling powerful text-driven image editing. We train a smaller model on a filtered dataset and release its code and weights at https://github.com/openai/glide-text2im.
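The classifier-free guidance rule named in the abstract is a simple combination of two noise predictions from the same model, one conditioned on the caption and one unconditioned. The sketch below shows that combination on flat lists; in practice it is applied to tensors at every denoising step.

```python
def classifier_free_guidance(eps_cond, eps_uncond, guidance_scale):
    """eps = eps_uncond + w * (eps_cond - eps_uncond).
    w = 1 recovers the conditional model's prediction; w > 1
    pushes samples toward the caption at some cost in diversity."""
    return [eu + guidance_scale * (ec - eu)
            for ec, eu in zip(eps_cond, eps_uncond)]
```

Unlike CLIP guidance, no second model is needed: the unconditional prediction comes from the same network with the caption dropped.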
State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify any other visual concept. Learning directly from raw text about images is a promising alternative which leverages a much broader source of supervision. We demonstrate that the simple pre-training task of predicting which caption goes with which image is an efficient and scalable way to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet. After pre-training, natural language is used to reference learned visual concepts (or describe new ones) enabling zero-shot transfer of the model to downstream tasks. We study the performance of this approach by benchmarking on over 30 different existing computer vision datasets, spanning tasks such as OCR, action recognition in videos, geo-localization, and many types of fine-grained object classification. The model transfers non-trivially to most tasks and is often competitive with a fully supervised baseline without the need for any dataset specific training. For instance, we match the accuracy of the original ResNet-50 on ImageNet zero-shot without needing to use any of the 1.28 million training examples it was trained on. We release our code and pre-trained model weights at https://github.com/OpenAI/CLIP.
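The pre-training task of predicting which caption goes with which image is typically realized as a symmetric contrastive loss: matched (image, text) pairs sit on the diagonal of a similarity matrix, and cross-entropy is taken over rows and columns and averaged. The sketch below is a minimal, stdlib-only version on pre-normalized embeddings given as lists of lists; the real implementation operates on batched tensors with a learned temperature.

```python
import math


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss for CLIP-style pre-training on n
    matched (image, text) embedding pairs."""
    n = len(image_embs)
    # Pairwise cosine similarities (embeddings assumed unit-norm).
    sims = [[sum(a * b for a, b in zip(ie, te)) / temperature
             for te in text_embs] for ie in image_embs]

    def xent_diag(m):
        # Mean cross-entropy with targets on the diagonal.
        total = 0.0
        for i, row in enumerate(m):
            log_z = math.log(sum(math.exp(x) for x in row))
            total += log_z - row[i]
        return total / n

    cols = [list(col) for col in zip(*sims)]       # text -> image view
    return 0.5 * (xent_diag(sims) + xent_diag(cols))
```

The loss is near zero when each image is most similar to its own caption and grows as pairs are shuffled, which is exactly the signal that lets natural language name visual concepts at transfer time.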
Computed tomography (CT) has been routinely used for the diagnosis of lung diseases and, recently, during the pandemic, for detecting the infectivity and severity of COVID-19 disease. One of the major concerns in using machine learning (ML) approaches for automatic processing of CT scan images in a clinical setting is that these methods are trained on limited and biased subsets of publicly available COVID-19 data. This has raised concerns regarding the generalizability of these models on external datasets, not seen by the model during training. To address some of these issues, in this work CT scan images from confirmed COVID-19 data obtained from one of the largest public repositories, COVIDx CT 2A, were used for training and internal validation of machine learning models. For the external validation we generated the Indian-COVID-19 CT dataset, an open-source repository containing 3D CT volumes and 12096 chest CT images from 288 COVID-19 patients from India. A comparative performance evaluation of four state-of-the-art machine learning models, viz., a lightweight convolutional neural network (CNN) and three other CNN-based deep learning (DL) models, VGG-16, ResNet-50, and Inception-v3, in classifying CT images into three classes, viz., normal, non-COVID pneumonia, and COVID-19, is carried out on these two datasets. Our analysis showed that the performance of all the models is comparable on the hold-out COVIDx CT 2A test set, with 90% - 99% accuracies (96% for the CNN), while on the external Indian-COVID-19 CT dataset a drop in performance is observed for all the models (8% - 19%). The lightweight CNN performed the best on the external dataset (accuracy 88%) in comparison to the deep learning models, indicating that a lightweight CNN generalizes better to unseen data. The data and code are made available at https://github.com/aleesuss/c19.